On the Policy Improvement Algorithm in Continuous Time
Author
Abstract
We develop a general approach to the Policy Improvement Algorithm (PIA) for stochastic control problems for continuous-time processes. The main results assume only that the controls lie in a compact metric space and give general sufficient conditions for the PIA to be well-defined and converge in continuous time (i.e. without time discretisation). It emerges that the natural context for the PIA in continuous time is weak stochastic control. We give examples of control problems demonstrating the need for the weak formulation as well as diffusion-based classes of problems where the PIA in continuous time is applicable.
Similar resources
Operation Scheduling of MGs Based on Deep Reinforcement Learning Algorithm
In this paper, operation scheduling of Microgrids (MGs), including Distributed Energy Resources (DERs) and Energy Storage Systems (ESSs), is addressed using a Deep Reinforcement Learning (DRL) based approach. Because of the dynamic nature of the problem, it is first formulated as a Markov Decision Process (MDP). Next, the Deep Deterministic Policy Gradient (DDPG) algorithm is presented t...
Enhancing Basic Metal Industry Global Competitiveness Through Total Quality Management, Supply Chain Management & Just-in-Time
Selecting and implementing an adequate and appropriate continuous improvement strategy is a key success factor for improving firm performance and enhancing competitive advantage in manufacturing industries. As a result, a special role is given to continuous improvement programs such as Supply Chain Management (SCM), Six-Sigma, Total Quality Management (TQM), Kaizen, Just-in-Time (J...
Joint Pricing and Inventory Control for Deteriorating Items with Partially Backlogged Shortage Costs
Determining the optimal selling price and inventory control policy for deteriorating items is an important issue in academic and industrial research. In this paper, a joint pricing and inventory control model for deteriorating items is considered. The demand rate is known, continuous, and a function of price and time. Shortages are allowed and partially backlogged, where the backloggin...
Integrated JIT Lot-Splitting Model with Setup Time Reduction for Different Delivery Policy using PSO Algorithm
This article develops an integrated JIT lot-splitting model for a single supplier and a single buyer. The model considers setup time reduction, and the optimal lot size is obtained under the reduced setup time in the context of joint optimization for both buyer and supplier, under deterministic conditions with a single product. Two cases are discussed: the Single Delivery (SD) case and the Multi...